Remote monitoring of an industrial control system

Project by George Xie. Developed 2014-2015; server and user account maintenance ongoing.

Article incomplete! Last updated: November 2015

Background:

Control system 1.0

PID-based controllers and teams of technicians working in shifts together operate the plant and keep it in optimal conditions. Large swings in the system often force the technicians to temporarily change the PID target value for a faster correction and to prevent the system from operating dangerously. The technicians earn sizable bonuses, which are docked for deviation from optimal conditions.
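For readers unfamiliar with PID control, a minimal sketch of the control law is below. This is illustrative only: the plant's actual controllers and their tuning are not described in this article, and all names and gains here are made up.

```python
# Minimal, illustrative PID controller. The setpoint is the target value
# the technicians may temporarily change for a faster correction.
class PID:
    def __init__(self, kp, ki, kd, setpoint):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.setpoint = setpoint
        self.integral = 0.0
        self.prev_error = None

    def update(self, measurement, dt):
        # Proportional, integral, and derivative terms on the error signal.
        error = self.setpoint - measurement
        self.integral += error * dt
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```

Each call to `update` takes the latest sensor reading and returns the actuator command for that cycle.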

Control system 2.0

A replacement for the PID controllers to improve the stability and performance of the system. The new controller uses hundreds of sensors to control tens of actuators as one coherent system.

The new controller operates in discrete real-time cycles, once per second or faster. The old system remains operational and can be switched back to manually. The new controller is in direct control of safety-critical systems.

To aid in the design and optimization of the controller logic, a proprietary Function Block Diagram (FBD) based set of tools is used. Working in an FBD is less like programming and more like electrical circuit design. It is very well suited to live debugging of a real-time system, but lacks many basic features of programming languages, such as conditional jumps outside of built-in blocks.

The tools used to design and debug the controller logic all build on the same shared-memory mechanism:

The FBD tools use a shared-memory .dll to share IO and line values across tools for live debugging, and to save block internal state for runtime updates. The shared values have statically assigned addresses, expressed as displacements into the shared memory; this allows different tools opened on the same design files to access the same values. There is no higher-level synchronization beyond two guarantees: all values are 32 bits on 32-bit machines, and no value is written to by two different processes. (For example, the live system and history playback both writing to inputs is a user error, prevented by each process refusing to start while the other is running; two outputs writing to the same address is prevented by static checks in the graphical editor.)
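The fixed-displacement layout can be sketched as follows. This is a simplified illustration, not the proprietary .dll: it uses an anonymous in-process mapping for the demo, and the region size, function names, and addresses are all invented.

```python
import mmap
import struct

# Sketch of the shared-memory layout: every line value occupies a 32-bit
# slot at a statically assigned displacement from the start of the region.
REGION_SIZE = 4096  # bytes; room for 1024 32-bit values (illustrative)

shm = mmap.mmap(-1, REGION_SIZE)  # anonymous mapping, demo only

def write_value(displacement, value):
    # One writer per address: in the real tools, static checks in the
    # graphical editor prevent two outputs from sharing a displacement.
    struct.pack_into("<i", shm, displacement, value)

def read_value(displacement):
    return struct.unpack_from("<i", shm, displacement)[0]

write_value(0x10, 42)
print(read_value(0x10))  # 42
```

Because every tool computes the same displacement from the same design files, two processes mapping the region see identical values without any coordination protocol.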

Now Add The Internet! A Bad Idea?

Yes. The internal networks of the plants are typically cut off from the Internet for safety reasons. Our controllers, built with commodity PCs, do not connect to any internal network other than the IO switch.

The Business Case

Deployment of a controller takes many days, sometimes up to a month, of on-site adjustments to build a stable system. The plant's blueprints help with design, but only deployment on site allows correcting for errors in sensors and adjusting to the real operating conditions, and only long-term monitoring can test the system against the wide range of load conditions a plant encounters.

This stage of on-site deployment requires specialized personnel, who can only work at one plant at a time since the plants are geographically distributed. This is a major expense and limits the number of contracts the business can bid for at once.

Remote Desktop

The first attempt at remote deployment used a remote desktop solution. This reduces the on-site work to installing the appropriate hardware and software and setting up the IO mappings for the controller, which any technical person can do. A mobile data card is used to connect to the Internet (although the technicians cannot surf the web on production systems, they do have cell reception). Remote desktop is slow and expensive on mobile data, but data fees beat hotel fees, and waiting for remote desktop to update beats traveling. The business sought a better solution that uses less data and has a UI that responds instantly to user actions.

Solution:

When investigating the feasibility of the project, I was surprised by how well all the tools took to a microservices-with-unified-communications architecture. The simplicity and ubiquity of the shared-memory model gave me a simple component through which to hook the tools. The challenge is synchronizing the shared memory across the network in a way that is efficient in cost and speed, works well with the workflow of the tools and users, and is secure.

The tools do not access memory directly, but go through getters and setters inside the .dll. Simply turning the getters into remote procedure calls does not work: the getters are called in series, so the user interface would freeze for many network round trips. Instead, a get call on the user side is recorded, and the old local value, which defaults to 0, is returned immediately. Periodically, a frame of all recently requested addresses is sent to the remotely controlled machine, which returns only the values the user is interested in. When a user opens a live interface, it displays updates by repeatedly calling the getters, once per second, with the same addresses. The user initially sees 0 or outdated data, then sees updated data as it arrives.
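The deferred-getter scheme can be sketched as below. The class and method names are hypothetical, and the network send/receive itself is elided; the sketch shows only the record-and-return-stale behavior and the periodic frame exchange.

```python
# Sketch of the deferred-getter scheme: get() never blocks on the network.
# It records the address, returns the last known local value (default 0),
# and a periodic task ships the recorded addresses as one request frame.
class RemoteCache:
    def __init__(self):
        self.values = {}      # address -> last value received from remote
        self.pending = set()  # addresses requested since the last frame

    def get(self, address):
        self.pending.add(address)
        return self.values.get(address, 0)  # stale or default, no round trip

    def build_request_frame(self):
        # Called once per update period; sent to the remote machine.
        frame = sorted(self.pending)
        self.pending.clear()
        return frame

    def apply_reply_frame(self, reply):
        # The reply carries only the values the user asked about.
        self.values.update(reply)

cache = RemoteCache()
cache.get(0x10)                       # first read returns the default 0
frame = cache.build_request_frame()   # [0x10], shipped over the network
cache.apply_reply_frame({0x10: 123})  # reply arrives asynchronously
print(cache.get(0x10))                # now returns 123
```

Because the live interfaces re-call the same getters every second anyway, stale reads are refreshed automatically on the next cycle once replies arrive.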

This network component is controlled by a user interface that allows selecting the remote endpoint, starting and stopping the data synchronization, and changing the update period. The UI also shows two frame counters, for requests and replies, to show how quickly values are updating.

All network communications go through the server using TLS. The server decides which users have rights to which remote machines. More details on the usage of TLS are here.
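A client's TLS connection to the server could look like the sketch below. The hostname and port are placeholders, and the real server's user-to-machine authorization logic is not shown.

```python
import socket
import ssl

# Sketch of a TLS client connection to the relay server. The default
# context verifies the server certificate and checks the hostname.
def connect_to_server(host, port):
    context = ssl.create_default_context()
    raw = socket.create_connection((host, port))
    return context.wrap_socket(raw, server_hostname=host)

# Usage (placeholder endpoint):
# conn = connect_to_server("relay.example.com", 443)
# conn.sendall(request_frame_bytes)
```

Routing everything through one server means the controllers never accept inbound connections; both ends dial out, which suits machines sitting behind mobile data links.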